Mining approximate patterns with frequent locally optimal occurrences
Authors
Abstract
We propose a novel frequent approximate pattern mining method that is suited to estimating occurrence regions. Given a string s, the method enumerates the substrings of s that locally optimally match many substrings of s. We give an algorithm for this problem in which candidate patterns are generated without duplication using the suffix tree of s. The problem can be extended to that of enumerating approximate frequent subforests of a given ordered labeled tree T. We applied the method to the task of extracting search result records from a web page returned by a search engine, and it performed well on benchmark data sets.
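The idea of duplicate-free candidate generation followed by counting locally optimal approximate matches can be illustrated with a small sketch. This is not the authors' algorithm: it assumes edit distance as the match score, uses a naive suffix trie as a stand-in for the suffix tree, and adopts a simplified notion of "locally optimal" (an end position whose score is within a threshold k, no worse than the previous end position, and strictly better than the next). The function names and the parameters k, min_len, max_len and min_freq are all hypothetical.

```python
def distinct_substrings(s, max_len):
    """Yield every distinct substring of s with length <= max_len exactly once.

    Assumption: a naive suffix trie (nested dicts) stands in for the suffix
    tree. Each path from the root spells one distinct substring, so walking
    the trie cannot produce the same candidate twice.
    """
    root = {}
    for i in range(len(s)):
        node = root
        for ch in s[i:i + max_len]:
            node = node.setdefault(ch, {})
    stack = [("", root)]
    while stack:
        prefix, node = stack.pop()
        for ch, child in node.items():
            yield prefix + ch
            stack.append((prefix + ch, child))


def end_distances(pattern, s):
    """Return d where d[j] is the minimum edit distance between pattern and
    any substring of s ending at position j (1-based; d[0] is the empty end).
    This is the classic approximate string matching dynamic program."""
    prev = [0] * (len(s) + 1)  # an empty pattern matches anywhere at cost 0
    for i, pc in enumerate(pattern, 1):
        cur = [i] + [0] * len(s)
        for j, sc in enumerate(s, 1):
            cur[j] = min(prev[j - 1] + (pc != sc),  # match or substitute
                         prev[j] + 1,               # skip a pattern character
                         cur[j - 1] + 1)            # skip a text character
        prev = cur
    return prev


def locally_optimal_ends(pattern, s, k):
    """End positions j with d[j] <= k, d[j] <= d[j-1], and d[j] < d[j+1]
    (or j at the end of s). Simplified, assumed definition."""
    d = end_distances(pattern, s)
    n = len(s)
    return [j for j in range(1, n + 1)
            if d[j] <= k and d[j] <= d[j - 1] and (j == n or d[j] < d[j + 1])]


def mine(s, k, min_len, max_len, min_freq):
    """Map each candidate substring to its locally optimal end positions,
    keeping only candidates with at least min_freq such positions."""
    result = {}
    for p in distinct_substrings(s, max_len):
        if len(p) >= min_len:
            ends = locally_optimal_ends(p, s, k)
            if len(ends) >= min_freq:
                result[p] = ends
    return result


patterns = mine("abcabdabc", k=1, min_len=3, max_len=3, min_freq=3)
print(patterns["abc"])  # [3, 6, 9]
```

In this toy input the pattern "abc" matches itself exactly at end positions 3 and 9 and matches "abd" with one substitution at end position 6, so it reaches the assumed frequency threshold of 3. The sketch runs the dynamic program once per candidate and makes no attempt at the efficiency a real suffix-tree-based method would aim for.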
Similar resources
REAFUM: Representative Approximate Frequent Subgraph Mining
Noisy graph data and pattern variations are two thorny problems in frequent subgraph mining. Traditional exact-matching-based methods, however, only generate patterns that have enough perfect matches in the graph database. As a result, a pattern may either remain undetected or be reported as multiple (almost identical) patterns if it manifests slightly different instances in different gr...
Relationship-aware sequential pattern mining: results on medical practise on antibiotic treatment and resistance development
Relationship-aware sequential pattern mining is the problem of mining frequent patterns in sequences in which the events of a sequence are mutually related by one or more concepts from some respective hierarchical taxonomies, based on the type of the events. Additionally, the events themselves are described by a certain number of taxonomical concepts. We present RaSP, an algorithm that is able...
Relationship-aware sequential pattern mining
Relationship-aware sequential pattern mining is the problem of mining frequent patterns in sequences in which the events of a sequence are mutually related by one or more concepts from some respective hierarchical taxonomies, based on the type of the events. Additionally, the events themselves are described by a certain number of taxonomical concepts. We present RaSP, an algorithm that is able...
Mining Approximate Frequent Patterns from Graph Databases
Graph analytics is the process of discovering patterns and insights from data that can be modeled as graphs. Algorithms for graph analytics fall into two broad categories: mining and management. Graph mining algorithms are often used in graph management and vice versa. In recent times, these algorithms have become an indispensable tool for analyzing networks in domains such as i) Computational...
Distributed Discovery of Multi-Level Approximate Process Patterns
Process mining focuses on the discovery of knowledge about a (business) process from a set of its executions stored in an event log. Each event describes an activity and its performer. Process mining techniques allow the automatic extraction of a process model that gives insight into various perspectives, such as the control-flow, data, and organizational perspectives. In this paper,...
Journal title:
- Discrete Applied Mathematics
Volume 200 Issue
Pages -
Publication date 2016